Model selection consistency



On model selection consistency of penalized M-estimators: a geometric theory

Jason D. Lee, Yuekai Sun, Jonathan E. Taylor

Neural Information Processing Systems

Penalized M-estimators are used in diverse areas of science and engineering to fit high-dimensional models with some low-dimensional structure. Often, the penalties are geometrically decomposable, i.e. can be expressed as a sum of support functions over convex sets. We generalize the notion of irrepresentable to geometrically decomposable penalties and develop a general framework for establishing consistency and model selection consistency of M-estimators with such penalties. We then use this framework to derive results for some special cases of interest in bioinformatics and statistical learning.
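The simplest geometrically decomposable penalty is the l1 norm, so the simplest member of the family the abstract describes is the lasso. As a hedged illustration only (this is not the paper's general framework, and the parameter choices are ours), a penalized M-estimator with squared-error loss and l1 penalty can be fit by proximal gradient descent:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def penalized_m_estimator(X, y, lam=0.1, n_iter=1000):
    """Minimize (1/2n)||y - X beta||^2 + lam * ||beta||_1 by proximal
    gradient descent (ISTA).  Illustrative sketch: the l1 norm is the
    simplest geometrically decomposable penalty."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n  # Lipschitz constant of the loss gradient
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = soft_threshold(beta - grad / L, lam / L)
    return beta
```

Model selection consistency here means the estimated support matches the true support with high probability, which is what the irrepresentable-type conditions in the paper guarantee.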





AdapDISCOM: An Adaptive Sparse Regression Method for High-Dimensional Multimodal Data With Block-Wise Missingness and Measurement Errors

Diakité, Abdoul O., Moreau, Claudia, Bezgin, Gleb, Bhagwat, Nikhil, Rosa-Neto, Pedro, Poline, Jean-Baptiste, Girard, Simon, Barry, Amadou, for the Alzheimer's Disease Neuroimaging Initiative

arXiv.org Machine Learning

Multimodal high-dimensional data are increasingly prevalent in biomedical research, yet they are often compromised by block-wise missingness and measurement errors, posing significant challenges for statistical inference and prediction. We propose AdapDISCOM, a novel adaptive direct sparse regression method that simultaneously addresses these two pervasive issues. Building on the DISCOM framework, AdapDISCOM introduces modality-specific weighting schemes to account for heterogeneity in data structures and error magnitudes across modalities. We establish the theoretical properties of AdapDISCOM, including model selection consistency and convergence rates under sub-Gaussian and heavy-tailed settings, and develop robust and computationally efficient variants (AdapDISCOM-Huber and Fast-AdapDISCOM). Extensive simulations demonstrate that AdapDISCOM consistently outperforms existing methods such as DISCOM, SCOM, and CoCoLasso, particularly under heterogeneous contamination and heavy-tailed distributions. Finally, we apply AdapDISCOM to Alzheimer's Disease Neuroimaging Initiative (ADNI) data, demonstrating improved prediction of cognitive scores and reliable selection of established biomarkers, even with substantial missingness and measurement errors. AdapDISCOM provides a flexible, robust, and scalable framework for high-dimensional multimodal data analysis under realistic data imperfections.
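The core idea behind covariance-based direct sparse regression under block-wise missingness is that each entry of the second-moment matrix can be estimated from whichever samples observe both coordinates. A minimal available-case sketch (this is the generic DISCOM-style ingredient only, not AdapDISCOM's modality-specific weighting):

```python
import numpy as np

def pairwise_cov(X):
    """Entrywise second-moment estimate using, for each (j, k), only
    the rows where both coordinates are observed (NaN marks a missing
    value).  Sketch of the available-case covariance idea behind
    DISCOM-style estimators; function name is ours."""
    n, p = X.shape
    S = np.empty((p, p))
    for j in range(p):
        for k in range(p):
            both = ~np.isnan(X[:, j]) & ~np.isnan(X[:, k])
            S[j, k] = np.mean(X[both, j] * X[both, k])
    return S
```

With complete data this reduces to the usual X'X/n; with a missing block it still yields a finite estimate for every entry, which a downstream penalized regression can consume.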


Split LBI: An Iterative Regularization Path with Structural Sparsity

Chendi Huang

Neural Information Processing Systems

This paper proposes an iterative regularization path with structural sparsity, based on variable splitting and the Linearized Bregman Iteration and hence called Split LBI. Despite its simplicity, Split LBI outperforms the popular generalized Lasso in both theory and experiments. A theory of path consistency is presented showing that, equipped with proper early stopping, Split LBI may achieve model selection consistency under a family of irrepresentable conditions that can be weaker than the necessary and sufficient condition for the generalized Lasso.
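For context, the plain Linearized Bregman Iteration on which Split LBI builds generates a sparse regularization path from two cheap updates per step; a hedged sketch for squared-error loss (the splitting variable and its parameter, which define Split LBI proper, are omitted, and the step-size rule below is an assumption for stability):

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lbi(X, y, kappa=10.0, alpha=None, n_iter=500):
    """Linearized Bregman Iteration for (1/2n)||y - X beta||^2.
    Returns the whole path; early stopping means picking an iterate
    along it.  Sketch only -- not the paper's Split LBI."""
    n, p = X.shape
    if alpha is None:
        # step size chosen so kappa * alpha * ||X'X/n|| < 1 (assumption)
        alpha = 1.0 / (kappa * np.linalg.norm(X, 2) ** 2 / n)
    z = np.zeros(p)
    beta = np.zeros(p)
    path = []
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z -= alpha * grad
        beta = kappa * soft_threshold(z, 1.0)
        path.append(beta.copy())
    return np.array(path)
```

Early stopping along this path plays the role that the tuning parameter plays for the Lasso, which is why the paper's consistency theory is stated for a stopping time rather than a penalty level.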


Interpretable and Scalable Graphical Models for Complex Spatio-temporal Processes

Wang, Yu

arXiv.org Artificial Intelligence

This thesis focuses on data that has complex spatio-temporal structure and on probabilistic graphical models that learn the structure in an interpretable and scalable manner. We target two research areas of interest: Gaussian graphical models for tensor-variate data and summarization of complex time-varying texts using topic models. This work advances the state-of-the-art in several directions. First, it introduces a new class of tensor-variate Gaussian graphical models via the Sylvester tensor equation. Second, it develops an optimization technique based on a fast-converging proximal alternating linearized minimization method, which scales tensor-variate Gaussian graphical model estimations to modern big-data settings. Third, it connects Kronecker-structured (inverse) covariance models with spatio-temporal partial differential equations (PDEs) and introduces a new framework for ensemble Kalman filtering that is capable of tracking chaotic physical systems. Fourth, it proposes a modular and interpretable framework for unsupervised and weakly-supervised probabilistic topic modeling of time-varying data that combines generative statistical models with computational geometric methods. Throughout, practical applications of the methodology are considered using real datasets. This includes brain-connectivity analysis using EEG data, space weather forecasting using solar imaging data, longitudinal analysis of public opinions using Twitter data, and mining of mental health related issues using TalkLife data. We show in each case that the graphical modeling framework introduced here leads to improved interpretability, accuracy, and scalability.
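A building block behind the Kronecker-structured and Sylvester-equation models mentioned above is the Kronecker sum of two matrices, whose eigenvalues are all pairwise sums of the factors' eigenvalues. A minimal sketch (function name is ours):

```python
import numpy as np

def kron_sum(A, B):
    """Kronecker sum A (+) B = A kron I + I kron B, the structure
    underlying Kronecker-structured (inverse) covariance models."""
    return (np.kron(A, np.eye(B.shape[0]))
            + np.kron(np.eye(A.shape[0]), B))
```

Because the two Kronecker terms commute, the spectrum of the sum factorizes, which is what makes estimation and filtering with such precision matrices scale to tensor-variate data.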


Thresholded Graphical Lasso Adjusts for Latent Variables: Application to Functional Neural Connectivity

Wang, Minjie, Allen, Genevera I.

arXiv.org Machine Learning

Emerging neuroscience technologies such as electrophysiology and calcium imaging can record from tens of thousands of neurons in the live animal brain while the animal is responding to stimuli and behaving freely. Scientists often seek to understand how neurons are communicating during certain stimuli or activities, something termed functional neural connectivity. To learn functional connections from large-scale neuroscience data, many have proposed using probabilistic graphical models (Yatsenko et al. 2015; Narayan et al. 2015; Chang et al. 2019), where each edge denotes conditional dependencies between nodes. Yet, applying such models in neuroscience poses a major challenge, as only a small subset of neurons in the animal brain can be recorded at once, leading to abundant latent variables. Chandrasekaran et al. (2012) termed this the latent variable graphical model problem and proposed a convex program to solve it. While conceptually attractive, this approach poses several statistical, computational, and practical challenges, discussed subsequently, for the task of learning functional neural connectivity from large-scale neuroscience data. Because of this, we are motivated to consider an incredibly simple solution to the latent variable graphical model problem: apply a hard thresholding operator to existing graph selection estimators. In this paper, we study this approach, showing that thresholding has more desirable theoretical properties as well as superior empirical performance.
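The "incredibly simple solution" the abstract describes composes two off-the-shelf steps. A hedged sketch using scikit-learn's graphical lasso as the base graph selection estimator (the threshold and penalty values are illustrative assumptions, not the paper's tuning rules):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

def thresholded_glasso(X, alpha=0.05, tau=0.1):
    """Fit the graphical lasso, then hard-threshold small off-diagonal
    precision entries to zero.  Sketch of the thresholding idea studied
    in the paper; parameter choices here are illustrative."""
    model = GraphicalLasso(alpha=alpha).fit(X)
    Theta = (model.precision_ + model.precision_.T) / 2  # enforce symmetry
    off = ~np.eye(Theta.shape[0], dtype=bool)
    Theta[off & (np.abs(Theta) < tau)] = 0.0
    return Theta
```

The appeal over the convex program of Chandrasekaran et al. (2012) is that this requires no extra optimization: the latent-variable correction reduces to one pass of elementwise thresholding.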


Machine Learning Advances for Time Series Forecasting

Masini, Ricardo P., Medeiros, Marcelo C., Mendes, Eduardo F.

arXiv.org Machine Learning

In this paper we survey the most recent advances in supervised machine learning and high-dimensional models for time series forecasting. We consider both linear and nonlinear alternatives. Among the linear methods we pay special attention to penalized regressions and ensembles of models. The nonlinear methods considered in the paper include shallow and deep neural networks, in their feed-forward and recurrent versions, and tree-based methods, such as random forests and boosted trees. We also consider ensemble and hybrid models that combine ingredients from different alternatives. Tests for superior predictive ability are briefly reviewed. Finally, we discuss applications of machine learning in economics and finance and provide an illustration with high-frequency financial data.
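The simplest instance of the penalized regressions the survey covers is a lasso-penalized autoregression: lag the series into a design matrix, fit with an l1 penalty so irrelevant lags are dropped, and forecast one step ahead. A hedged sketch (function names and parameter values are ours, for illustration only):

```python
import numpy as np
from sklearn.linear_model import Lasso

def lagged_design(y, n_lags):
    """Stack n_lags lagged copies of a univariate series into a design
    matrix; the last column holds lag 1, the first holds lag n_lags."""
    X = np.column_stack([y[i:len(y) - n_lags + i] for i in range(n_lags)])
    return X, y[n_lags:]

def lasso_ar_forecast(y, n_lags=5, alpha=0.01):
    """Fit a lasso-penalized autoregression and return the model plus
    the one-step-ahead forecast -- a minimal example of the penalized
    forecasters reviewed in the survey."""
    X, target = lagged_design(y, n_lags)
    model = Lasso(alpha=alpha).fit(X, target)
    return model, model.predict(y[-n_lags:].reshape(1, -1))[0]
```

The l1 penalty performs lag selection automatically, which is one reason penalized regressions scale to the high-dimensional predictor sets (many lags, many series) discussed in the survey.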